Single-Cell Profiling of Latently SIV-Infected CD4+ T Cells Directly Ex Vivo to Reveal Host Factors Supporting Reservoir Persistence

ABSTRACT HIV-1 cure strategies aiming to eliminate persistent infected cell reservoirs are hampered by a poor understanding of cells harboring viral DNA in vivo. We describe a novel method to identify, enumerate, and characterize in detail individual cells infected in vivo using a combination of single-cell multiplexed assays for integrated proviral DNA, quantitative viral and host gene expression, and quantitative surface protein expression without any in vitro manipulation. Latently infected CD4+ T cells, defined as harboring integrated provirus in the absence of spliced viral mRNA, were identified from macaque lymph nodes during acute, chronic, and combination antiretroviral therapy (cART)-suppressed simian immunodeficiency virus (SIV) infection. Latently infected CD4+ T cells were most abundant during acute SIV (~8% of memory CD4+ T cells) and persisted in chronic and cART-suppressed infection. Productively infected cells actively transcribing viral mRNA, by contrast, were much more labile and declined substantially between acute and chronic or cART-suppressed infection. Expression of most surface proteins and host genes was similar between latently infected cells and uninfected cells. Elevated FLIP mRNA and surface CD3 expression among latently infected cells suggest increased survival potential and capacity to respond to T cell receptor stimulation. These findings point to a large pool of latently infected CD4+ T cells established very early in acute infection and upregulated host factors that may facilitate their persistence in vivo, both of which pose potential challenges to eliminating HIV-1 reservoirs. IMPORTANCE Effective combination antiretroviral therapy controls HIV-1 infection but fails to eliminate latent viral reservoirs that give rise to viremia upon treatment interruption. Strategies to eradicate latently infected cells require a better understanding of their biology and distinguishing features to promote their elimination. Tools for studying these cells from patients are currently limited. Here, we developed a single-cell method to identify cells latently infected in vivo and to characterize these cells for expression of surface proteins and host genes without in vitro manipulation, capturing their in vivo state from SIV-infected macaques. Host factors involved in cell survival and proliferation were upregulated in latently infected cells, which were abundant in the earliest stages of acute infection. These studies provide insight into the basic biology of latently infected cells as well as potential mechanisms underlying the persistence of HIV-1/SIV reservoirs to inform development of novel HIV-1 cure strategies.

DNA (ipDNA) and highly multiplexed reverse transcriptase quantitative PCR (RT-qPCR) detection of viral and host RNA. Latently infected cells, defined as cells containing ipDNA and unspliced viral RNA below levels seen in productive infection, and lacking multiply spliced viral RNA (tat/rev), were identified in macaque lymph nodes spanning acute, chronic, and cART-suppressed simian immunodeficiency virus (SIV) infection. The frequency of latently infected memory CD4 T cells was greatest in acute infection and lowest following cART suppression. Unlike replication-active cells, latently infected cells were double-positive for surface CD3 and CD4 and lacked upregulation of activation markers. Analysis of host gene mRNA levels revealed increased expression of the prosurvival factor FLIP in latently infected cells in both acute and chronic infection. Overall, latently infected and uninfected memory CD4 T cells exhibited similar profiles, though differences noted in both host gene and protein expression demonstrate the power and sensitivity of this approach to distinguish novel features of infected cells directly ex vivo and elucidate potential mechanisms of their persistence.

RESULTS
Sensitivity and specificity of assay to identify latently infected single cells. To develop a method able to identify and characterize individual latently infected cells directly ex vivo, we modified our tSCEPTRE approach by incorporating a nuclear lysis step and adding qPCR detection of integrated provirus (Fig. 1A) (47)(48)(49). As in tSCEPTRE, cells were sorted by flow cytometry at one cell per well into a reaction mix for cell lysis, RNA reverse transcription using gene-specific primers including viral mRNA (e.g., gag and tat/rev), and cDNA preamplification followed by quantification by qPCR (48) (Fig. 1B). To harvest ipDNA, the nuclear envelope was lysed by proteinase K following three rounds of cDNA preamplification, and Alu-long terminal repeat (LTR) primers were then used to preamplify ipDNA during the remaining cDNA preamplification cycles. The infection status of each cell was determined by the combination of viral genes present in the resulting cDNA and genomic DNA (Fig. 1C). We hypothesized that latently infected cells would be distinguished by the presence of ipDNA and the absence of viral gene expression (e.g., multiply spliced tat/rev mRNA).
Since SIV infection in nonhuman primate models recapitulates many aspects of latent HIV-1 infection in humans (50)(51)(52)(53), including host cell tropism, viral persistence during cART, and much of the viral life cycle, we developed our approach for the study of latent SIV infection given the large number of intact proviral genomes in this model. Candidate qPCR assays specific for integrated SIV provirus were designed based on a previously described HIV-1 Alu-LTR assay widely used for quantification of ipDNA in bulk cells (38,39). To multiplex this assay with other viral assays and minimize potential primer interactions between assays, Alu-LTR was selected over Alu-gag (38,54,55), as the gag region was used for detecting 59 DNA and unspliced viral RNA. A downselected Alu-LTR assay was validated to ensure signal specificity, linear amplification, sensitivity at a single-cell level, and lack of interference with other viral (RT-)qPCR assays (Fig. 2). First, amplification of the 59 LTR was chosen over the 39 LTR based on comparison of Alu-LTR qPCR signal obtained with LTR primers oriented 59 or 39 of the provirus, respectively, and in the presence and absence of Alu and gag primers, using the 3D8 cell line containing one copy of integrated SIVmac316 as the template ( Fig. 2A) (55). Both the 59 and 39 LTR assays exhibited robust increases with Alu priming when the gag assay was excluded, demonstrating Alu-dependent amplification. Low-level background Alu-LTR qPCR signal was observed in the absence of Alu primer, as expected, due to unidirectional DNA synthesis from the LTR primer (38,39,54). When the gag primers were included to simulate the multiplexed RT-qPCR conditions, Alu-specific priming was maintained for the 59 LTR assay but not the 39 LTR assay (likely due to production of LTR-gag amplicons). The Alu-specific 59 LTR ipDNA signal was ;256-fold higher (i.e., ;8 E T , where E T = 40 2 CT and CT is the cycle threshold; E T , expression threshold) than the background observed with the 59 LTR primer alone.
To validate the assay for use in single-cell applications, we measured performance on limiting 3D8 cell template. Using serial dilutions of bulk 3D8 cell DNA, Alu-LTR amplification efficiency was high (90.5%) and reproducible (Fig. 2B, slope 23.57), but sensitivity below 10 cells was poor using extracted DNA template. To improve recovery at low cell numbers, 3D8 cells were sorted by flow cytometry at dilutions of 1 to 30 cells per well and processed as shown in Fig. 1A. Again, ipDNA qPCR exhibited linear amplification (Fig. 2C, slope 23.56). Background in the absence of Alu primer was again; 8 E T lower. Sensitivity within single cells was robust: integrated provirus was detected in 18 of 19 (95%) replicate single cells, as identified by SIV gag positivity, which served as a control for efficient nucleic acid recovery (Fig. 2D). Nuclear lysis was required for Alu-LTR amplification as signal approached background (no Alu) in the Single memory CD4 T cells from cryopreserved specimens are sorted by flow cytometry into 96-well PCR plates containing reverse transcription-preamplification reaction components and gene-specific primers (host and viral). RNA reverse transcription (RT), brief cDNA preamplification (3 cycles), nuclear lysis via subsequent proteinase K addition, and further preamplification (15 cycles) are followed by qPCR. (B) qPCR assays used to classify cellular infection stages. SIV genome integrated in rhesus macaque DNA is shown with individual genes relative to the nearest Alu repeat. Assays for viral RNA, tat/rev and gag, and primer positions are shown below. Assays for viral DNA are shown above. For Alu-LTR assay, green dashed line indicates integrationspecific amplification during preamplification stage. Green arrowheads above indicate primer positions for the second PCR-qPCR, not specific for ipDNA. (C) Infected cell classification scheme. The preamplified cDNA and ipDNA mixture is analyzed by qPCR for the presence of viral nucleic acids: Alu-LTR versus tat/rev (top), Alu-LTR versus gag (bottom), and others as shown in panel B. Cells are classified into the indicated infection status by the presence and relative amount of viral nucleic acids, as indicated in the table in the middle of the panel. Gene expression data (bottom, right; GOI, gene of interest) are combined with FACS data (top, right; POI, protein of interest) obtained during cell sorting indexed by well location. spec., specificity; uvRNA, unspliced viral RNA; gDNA, genomic DNA; ipDNA, integrated proviral DNA.
Latent SIV-Infected CD4 1 T Cell Profiling Microbiology Spectrum absence of proteinase K (Fig. 2E). Median E T difference between conditions with and without Alu primers was 10 (2 10 copies). ipDNA E T variability increased with cell dilution, likely due to stochastic detection at low-copy conditions, and spanned ;5 CT for 1-cell samples. The lowest positive single-cell Alu-LTR E T value was 12, suggesting that an E T value of $12 serves as a threshold for identifying provirus-positive cells. We validated this threshold using memory CD4 T cells from a chronically SIV-infected rhesus macaque lymph node. Background E T values did not exceed 10 in the absence of Alu primers, supporting an Alu-LTR E T value of $12 as an appropriate cutoff in ex vivo specimens (Fig. 2F). Taken together, these data indicate that the method is both sensitive and specific for SIV-infected cells harboring ipDNA and therefore appropriate for quantitative and qualitative assessments. Detection of latent SIV-infected rhesus cells during acute, chronic, and cARTsuppressed infection. To study latently infected cells directly ex vivo with single-cell resolution using this method, memory CD4 T cells were isolated by fluorescence-activated cell sorting (FACS) from four SIV-infected rhesus macaque lymph node specimens. Since SIV downregulates both CD3 and CD4 from the surface of infected T cells (56,57), cells lacking CD3 and CD4 were included in the gating strategy (Fig. 3A). To distinguish latently infected cells from productively infected cells, both of which contain ipDNA, the initial RT-preamplification reaction included primers for spliced (tat/rev) and unspliced (gag) viral RNA for downstream quantitation by qPCR. Transcriptionally silent latently infected cells are expected to be negative for tat/rev and low for gag. In addition, to  Fig. 1, including (green) or omitting (gray) Alu primers with or without the gag assay. LTR primers annealing to the same sequences within 59 or 39 LTR, both extending outward of the viral DNA, were compared. (B) Qualification of Alu-LTR using DNA dilution series. DNA from cell line 3D8 was extracted, serially diluted, and used in protocol described in Fig. 1, including or omitting Alu primers. Cell number equivalents to template DNA mass are plotted on the x axis, assuming 100% efficient cell and nuclear lysis. Linear regression slope and R 2 value for "1Alu" condition are indicated. (C) Qualification of Alu-LTR using sorted cells. 3D8 cells sorted by flow cytometry directly into reaction mix at the indicated cell dilutions were assessed for integrated SIV DNA by Alu-LTR qPCR as described in Fig. 1 in the presence or absence of Alu primers. (D) Sensitivity and specificity of the Alu-LTR assay at the single-cell level. Alu-LTR E T values obtained for single 3D8 cells described in panel C are plotted against gag E T values. Horizontal line at E T = 12 demarcates threshold for Alu primer-dependent signal, indicating ipDNA. (E) Alu-LTR signal achieved with and without nuclear lysis. Live 3D8 cells were processed as in panel C, including or omitting nuclear lysis with proteinase K. (F) Specificity of Alu-LTR assay in primary CD4 T cells latently infected in vivo. Single cells from a chronically SIV-infected rhesus macaque lymph node were processed as described in Fig. 1 and in the presence (n = 592) or absence (n = 176) of Alu primers. Annotation at right indicates integrated DNA classification status applied to macaque samples. assess provirus "intactness," the preamplified DNA was evaluated by a Rev response element (RRE)-specific qPCR assay targeting the 39 half of the genome (Fig. 1B), where large deletions are the most commonly occurring proviral mutation (58). Given the robust viral replication and relatively abundant cellular infection during acute infection, we analyzed a specimen collected at approximately peak viremia to facilitate identification of infected cells, including differentiation of latently infected cells from transcriptionally active cells, which we refer to as productively infected. Memory CD4 T cells (n = 768) from a mesenteric lymph node biopsy specimen of an SIVmac251-infected macaque (day 10 postinfection) were sorted and processed as described in Fig. 1A. Single-cell qPCR for the four viral gene assays (tat/rev, gag, RRE, Alu-LTR) revealed distinct infected cell populations, defined by the quantitative signals from each assay (Fig. 3B, "acute"). These clusters were interpreted to represent cells in discrete infection stages as follows: (i) productive or replication-active infection, as defined by tat/rev expression, which generally coincided with both high unspliced genomic RNA content (gag copies per cell . ;10 3 ) and robust Alu-LTR signal, (ii) early, abortive, or defective provirus infection, as defined by undetected Alu-LTR and tat/rev, but intermediate levels of genomic RNA (0 , gag copies per cell , ;10 3 ), (iii) latent infection, as defined by an Alu-LTR E T value of .12, lack of tat/ rev, and low or intermediate genomic RNA (gag copies per cell , ;10 3 ), and (iv) uninfected, as defined by negative qPCR for all SIV assays. Cells that did not satisfy any of the above criteria were deemed ambiguous and excluded from further analysis. Among productively and latently infected cells, positivity for both gag and RRE, representing 59 and 39 DNA, respectively, was used to define intact provirus.
Using these criteria, replication-active SIV-infected cells represented 13% of memory CD4 T cells (n = 103) in this lymph node specimen, consistent with prior estimation using an independent limiting dilution method (47,48) (Fig. 3C). Latently infected cells comprised 8% (n = 61), while early/abortive/defective infection was most abundant at 18% (n = 142). Thus, in total, ;40% of memory CD4 1 T cells were infected in this secondary lymphoid tissue during early acute infection. The majority of productively and latently infected cells were also positive for RRE and gag, indicating intact 59 and 39 regions of the provirus (Fig. 3C).
To further confirm the presence of integrated provirus in cells identified as productively or latently infected, we sequenced the integration site Alu-LTR PCR amplicon for ;30 single cells. Sequencing reads aligning to both Macaca mulatta and SIVmac251 spanning the presumed integration site were obtained for 5 cells, all with relatively high Alu-LTR E T values (Table 1; poor quality sequence data limited integration site identification for the other 25 cells). Three of these cells contained one integration site each, while two integration sites were identified per cell for two cells (one productive and one latent). Integration sites were identified in introns of host genes ARHGAP45, RBM33, MOB3A, and AKTIP and in intergenic regions, consistent with SIV and HIV-1 integration sites located both inside and outside genes (59). Of note, RBM33 was previously identified as an integration hot spot for HIV-1 (60). These findings support the presence of integrated provirus in cells identified by our singlecell Alu-LTR qPCR method.
We next examined cells from chronic infection when the dynamics between latent and productive cellular infection are expected to differ from the acute stage due to more limited virus replication. Lymph node memory CD4 T cells from two chronically SIV-infected macaques (7 months postinfection) were sorted and evaluated as described above (n = 1,728 cells). Overall, fewer infected cells were observed during Lastly, we measured infected cells during suppressive cART, which is expected to further limit viral replication and productive cellular infection. Lymph node memory CD4 T cells were analyzed in an animal infected with SIVmac251 that initiated cART 8 weeks postinfection, with suppressed viremia for 5 weeks at the time of biopsy (15 weeks postinfection). Due to the much lower frequency of infected cells in this setting relative to that in acute and untreated chronic infection (61), we identified cell dilutions containing one infected cell per well by replicate limiting dilution cell sorting and Poisson distribution probability estimation. These multicell wells were used for downstream analysis in lieu of single-cell wells for this specimen. For example, ,33% of replicate 100-cell wells were positive for any viral gene assay; these wells were therefore deemed unlikely to contain more than one infected cell and represent single infected cells for the purpose of viral gene profiling. As expected, productively SIV-infected cells were extremely rare, estimated at 0.004% of memory CD4 T cells (Fig. 3C). Latently infected cells were more abundant, representing 0.3%. The proportion of provirus-intact latently infected cells (RRE 1 and gag 1 ) was lower in the cART setting than in acute and untreated chronic infection, comprising only 16% of all latently infected cells, indicating greater accumulation of proviral deletions in suppressed infection (53). Early/abortive infection was ;10-fold less common than latent infection (0.02%), consistent with cART blockade of new rounds of cellular infection.
Flow cytometric surface protein profiling of latently infected single cells. To identify host proteins differentially expressed on the surface of infected CD4 T cells, we merged the viral RNA/DNA PCR data with the protein expression data collected by FACS for each cell from all untreated animals. It was not possible to apply this approach to the cART-suppressed animal due to the presence of uninfected cells combined with each infected cell in the bulk cell collection. Cells were grouped by infection status as described above. We first examined CD3 and CD4 expression, as cells lacking these proteins potentially introduce non-CD4 1 T cell contaminants in the analysis. While most infected cells were double-positive for CD3 and CD4, many tat/rev 1 cells expressed low levels of these surface proteins (Fig. 4A). Given the well-established downmodulation of both CD3 and CD4 by SIV (56,62,63), we attributed the reduced expression of these markers on tat/rev 1 cells to viral protein expression rather than infection of cells other than CD4 1 T cells. Therefore, for comparison purposes, we determined CD3, CD4 double-positive cells to be the appropriate uninfected reference population and excluded cells negative for either CD3 or CD4 and tat/rev from the uninfected cell group from further analysis (Fig. 4A). For the purposes of differential protein and gene expression analysis, RRE detection was not used as a discriminating  factor. This approach maximized statistical power through inclusion of all cells containing integrated provirus. Expression of 17 surface proteins was compared among cells in each of the infection classifications during acute and chronic infection. Overall, these proteins, which included memory, activation, homing, and exhaustion markers, exhibited similar expression patterns between the uninfected and infected cell groups (Fig. 4B to D). However, several differences were noted in comparisons between uninfected cells and productively infected cells as well as between latently infected cells and the other cell populations. Relative to uninfected cells, productively infected cells expressed less surface CD3 and CD4, as noted above. They also expressed increased CD95 (47,48), activation-associated proteins HLA-DR, TIGIT, CD38, and CD69, and a reported latency marker, interferon-induced transmembrane protein 1 (IFITM1) (30). These data are consistent with preferential targeting of activated T cells by SIV (64), though activation was not required for productive infection as many cells were negative or dim for activation markers.
Latently infected cells were double-positive for CD3 and CD4 in both acute and chronic infection with no evidence of downmodulation ( Fig. 4A and B), consistent with limited viral protein expression in these cells and further supporting their latent classification. CD3 was expressed at higher levels on latently infected cells than on uninfected cells in the two chronically infected animals (median fluorescence intensity [MFI] of 1,000 versus 789 and 842 versus 632, adjusted P value [P adj ] of 0.03 and 0.08, respectively, Table 2). Expression of CD3 on latently infected cells did not differ in the acute infection setting. CD38, ICOS, and CCR6 were also upregulated on the surface of latently infected cells during chronic infection. Relative to that on early/abortively infected cells, expression of CD3, PD-1, CD38, ICOS, and CCR6 was greater on latently infected cells in chronic infection, though only CD3 was consistently elevated in both animals. Latently infected cells expressed lower levels of CD8, HLA-DR, IFITM1, and CXCR5 than early/abortively infected cells in the chronic setting, and HLA-DR was also lower during acute infection. Other markers assessed, CD28, MHC class I, CXCR3, and integrin b7, did not exhibit differential expression. Taken together, in this univariate analysis, latently infected cells displayed an activation and homing profile overall similar to those of uninfected cells and productively and early/abortively infected cells, though expression of some activation markers varied.
To identify potential combinations of surface proteins able to distinguish latently infected cells using a multivariate approach, we visualized the cells in two-dimensional space using t-distributed stochastic neighbor embedding (tSNE), a method for nonlinear dimensionality reduction. We analyzed a total of 1,265 uninfected, 258 early/abortively infected, 112 latently infected, and 106 productively infected cells combined from the three macaque lymph nodes sampled during acute and chronical infection described above. All 17 surface proteins specific for memory, activation, homing, and exhaustion markers were included in the analysis. Single-cell tSNE plots did not reveal infection status-specific clusters, as all of the infected cell groupings were heavily interspersed with one another and with uninfected cells (Fig. 5). The analysis was further refined by removing uninfected cells lacking CD3 and CD4, as described above in Fig. 4A. Uninfected cells and the infected cell groupings remained intermingled, providing additional evidence that phenotypic profiles for these markers are largely similar across these cellular infection states.
Differential host gene expression among SIV-infected CD4 + T cells. To explore Latent SIV-Infected CD4 1 T Cell Profiling Microbiology Spectrum host genes differentially expressed by infected cells, we interrogated single CD4 1 T cells for expression of host genes by RT-qPCR using the same cells isolated from the three untreated acute and chronic SIV-infected lymph node specimens. Again, the cART-suppressed animal was not included in this analysis due to the mixture of uninfected cells with each infected cell. In this proof-of-principle analysis, we selected four host genes, CD3, CD4, CD40L, and FLIP, for their high expression positivity rate among memory CD4 T cells by single-cell RT-qPCR (47,48) and to maximize statistical power.
CD40L is a marker of T cell activation, while FLIP is a prosurvival factor that inhibits both apoptotic and necrotic cell death pathways (65,66). In productively infected cells, expression of CD3, CD4, and CD40L increased by ;1.3-, 2-, and 2.5-fold, respectively, compared to that in uninfected cells (Fig. 6). Of note, CD3 and CD4 transcript levels were elevated in the context of decreased expression of these surface proteins (Fig. 4), consistent with posttranslational mechanisms of downmodulation by viral proteins. Early/abortively infected cells from acute infection also trended toward increased expression of these genes, supporting greater activation of these cells relative to that of uninfected cells. Among latently infected cells, FLIP was upregulated consistently across all three acute and chronic infection samples relative to uninfected cells (1.6-to 1.9-fold). Latent cell expression of FLIP also trended greater than that in productively infected and early/abortively infected cells in the acute infection setting (P adj = 0.10 and P adj = 0.11, respectively). Such host factors may increase the resistance of latent cells to homeostatic cell death processes. Neither CD3 nor CD40L differed between latent and uninfected cells, while CD4 was elevated 1.8-and 6.5-fold in the acute infection samples and in one chronic infection sample, respectively. The lack of differential CD3 gene expression suggests that increased surface CD3 protein expression by latently infected cells occurs via a posttranscriptional mechanism. These data demonstrate feasibility of single-cell host gene expression profiling of latently infected cells directly ex vivo, including identification of host factors that may play a role in promoting their persistence.

DISCUSSION
Understanding unique features of latently infected cells will inform strategies to eliminate or modulate HIV-1 reservoirs. Here, we developed a novel single-cell approach to identify and characterize primary cells latently infected in vivo by combining flow cytometry and multiplexed RT-qPCR. Unlike many latency models, this approach includes no  infection macaque lymph node specimens as described above were analyzed. Cell selection criteria were applied as shown in Fig. 4A. Host gene expression is plotted (E T = C max 2 CT, where C max is maximum concentration of drug in serum) for the indicated genes. Lines between violin plots connect mean values. Significant differences between the latently infected cell population and the other cell groups as well as productively infected cells and uninfected cells are indicated (P adj , 0.10, Dunn's test for nonparametric pairwise multiple comparisons following Kruskal-Wallis test, and false-discovery rate adjustment using Benjamini-Hochberg).
Latent SIV-Infected CD4 1 T Cell Profiling Microbiology Spectrum sequencing in three latently and two productively infected cells confirmed provirus insertion in the host genome. The ratio of latent to productive cellular infection increased from acute to chronic SIV infection and from chronic to cART-suppressed infection. In proof-ofprinciple experiments, we demonstrate feasibility of infected single-cell characterization with respect to host protein and gene expression in acute and chronic infection settings, when infected cells are relatively abundant and well-powered comparisons between cell populations are possible. As expected, infected cells were most frequent during acute infection, when ;40% of all memory CD4 T cells fell into one of the three infection classes. In untreated chronic infection, the frequency was ;1 to 2 logs lower and further reduced by ;1 to 2 logs in cART-suppressed infection. Previous studies of cells harboring integrated SIV DNA in peripheral blood mononuclear cells (PBMC) from acute and chronic SIV infection using qPCR assays on bulk cell populations reported similar infected cell frequencies (54,55), indicating that the infected cell burden detected by our single-cell method is within the expected range. HIV-1 DNA levels in PBMC of acute, chronic, and suppressed HIV-1 infection tend to be lower, generally ;10 2 to 10 3 copies per 10 6 PBMC, or at most 1% of memory CD4 T cells in acute and chronic untreated infection, and 1 to 10 21 copies per 10 6 PBMC in cART-treated patients (67,68). These differences may reflect differences in pathogenesis between human HIV-1 infection and SIV infection in rhesus macaques and could limit feasibility of applying the method to HIV-1 specimens.
A surprising finding was the relatively high frequency of latently infected cells during acute infection, at ;8% of memory CD4 T cells, more than half that of productively infected cells (13%). The ;2:1 ratio of productive/latent infection is striking and suggests that a large pool of latently infected cells may be established very early in acute infection. A large proportion of latently infected cells in vivo is consistent with multiple blocks in HIV-1 transcription limiting mRNA expression, as observed in patient memory CD4 T cells (69,70). However, it is also possible that cells transitioning between an early postintegration stage and initiation of proviral transcription share the viral nucleic acid profile used here to define latent infection, resulting in an overestimation of latent cells by this method. Untreated acute infection might be particularly prone to this issue due to massive viral replication in memory CD4 T cells during this window (71). To limit this likelihood, our method incorporates several criteria to maximize specificity for latently infected cells and limit false positives. First, we defined latent infection as cells with Alu-LTR signal substantially above the background obtained in the absence of Alu primers, reducing the likelihood of including cells harboring unintegrated viral DNA. Second, cells with high genomic vRNA content (in the absence of tat/rev mRNA) were excluded to eliminate infected cells with any evidence of viral transcription. Establishment of a substantial pool of latently infected cells during early acute infection is supported by near universal viral rebound following treatment interruption in clinical and preclinical studies when cART is initiated within days of infection (3,7). Investigation in additional specimens is warranted to corroborate the size of the latent pool during acute infection.
While the proportion of infected cells with intact provirus during untreated acute infection was high, as expected and consistent with early acute HIV-1 infection (72), our estimate of the fraction of latently infected cells harboring intact provirus during cART-suppressed SIV infection was lower than that reported previously, at 16%. In prior analyses, animals started on cART during the first month of infection were found to contain .80% intact proviruses after 9 to 10 months of cART, with lower frequencies observed when cART was initiated after at least a year of infection at 28% and 47% (53,73). It is not clear why intact proviruses were less frequent in our study, in which cART was initiated 8 weeks postinfection, but we believe that methodologic differences are the most likely explanation. First, our analysis was confined to a subset of infected CD4 T cells defined as latently infected by the presence of integrated DNA and lacking viral transcription. Unlike prior studies interrogating total PBMC or CD4 T cells, our analysis may exclude CD4 T cell subsets or other cells enriched for intact genomes. Unintegrated DNA, phagocytosed DNA, and nonlatent cells transcribing integrated provirus, for example, are not represented in our analysis. Second, the efficiency of the single-cell real-time PCR method used here may vary from that of the PCR and near-full-length genome analyses used elsewhere, possibly failing to detect genomes with nucleotide substitution mutations, resulting in false-negative gag or RRE results. It is noteworthy that an analysis of HIV-1 proviruses during untreated Fiebig stage V infection (;30 to 100 days postinfection, similar to when cART was initiated in the cART-treated animal studied here) found only 15 to 30% intact genomes (72). While these results were based on a relatively small sample size, they raise the possibility of deleted genome accumulation after the first month of infection. Such genomes may have persisted in our analysis following a relatively short period of cART. It is also possible that lymph node specimens represent a distinct reservoir relative to that of peripheral blood, though analyses comparing HIV proviruses in matched lymph node and peripheral blood samples have found similar proportions of intact provirus between compartments (74,75), so this seems less likely.
Latently infected and uninfected cells were similar with respect to expression of most host genes and surface proteins studied, though some novel markers showed differential expression. Increased expression of FLIP, a negative regulator of many forms of cell death (65,66), among latently infected cells was observed consistently in acute and chronic infection. We previously reported FLIP upregulation among in vivo SIVinfected CD4 1 T cells lacking spliced viral RNA (48), which we now corroborate for the ipDNA 1 subset of this population. Elevated FLIP may contribute to the persistence of latently infected memory CD4 T cells in vivo by augmenting their survival potential, as has been suggested for other prosurvival factors enriched in HIV-1 infected cells such as BIRC5 and Bcl-2 (76,77). Of note, increased FLIP expression among latently infected cells present during acute infection suggests that enrichment of prosurvival factors may be established early in the course of cellular infection rather than selected for over a prolonged period through viral or immune-mediated clearance mechanisms. Interestingly, interfering with FLIP activity using Tat-vFLIP-a2 peptides induces selective death of latently HIV-1-infected CD4 1 T cells (78), supporting a role for FLIP in infected cell survival. CD3 protein, but not mRNA, expression was also consistently increased among latent cells, which may increase capacity to respond to T-cell receptor (TCR) stimulation and promote clonal expansion (79)(80)(81). Latently infected cells were not enriched for IFITM1 or exhaustion markers TIGIT and PD-1, all previously implicated in HIV-1 latency (30,41). SIV productively infected cells exhibited expected downregulation of surface CD3 and CD4 and modestly increased expression of several activation markers. Clustering analysis of the combined surface marker expression pattern on infected cells did not reveal a profile that defined latently infected cells. These results suggest that latent cellular infection overall is heterogeneous and generally similar to that of uninfected cells, with some notable exceptions. Future studies with a larger sample size of latently infected cells and expanded gene sets will be useful for extending these findings, including features associated with latent cells harboring intact versus defective provirus.
There are several limitations to this study. First, analysis was limited to four animals, only one of which was cART suppressed. Latently infected cells identified during untreated infection may not exhibit the same properties as cells that contribute to persistent viral reservoirs during suppressed infection. In addition, due to the rarity of latently infected cells during suppressed infection, we were able to establish only their viral nucleic acid profile and not their phenotype by single-cell surface or host gene expression profiling. Increasing the assay throughput to interrogate .10 5 cells with single-cell resolution will be required to address these gaps.
Recent advances in high-throughput and high-content single-cell technologies have revolutionized investigations of biological differences between cells, including in the context of HIV-1 infection (41,42,82). Our novel single-cell method offers a valuable strategy for assessing the basic biologic properties that distinguish latently infected cells as they exist in vivo, with several advantages over existing in vitro and ex vivo approaches. First, by analyzing individual cells, we avoid ambiguity regarding interpretation of signals averaged across populations of bulk cells. Second, cells are not manipulated following isolation, with no ex vivo culture or stimulation, preserving transcriptional profiles more reflective of their in vivo state. Third, by combing ipDNA PCR with the high-dimensional analyses of multiplexed RT-qPCR and polychromatic flow cytometry, up to ;150 parameters can be measured for each cell simultaneously in relatively high throughput (e.g., ;100 gene targets by multiplex qPCR; ;50 surface markers using current state-of-the-art flow cytometers). The approach may be particularly useful for validating single-cell transcriptome sequencing (RNA-seq) results with benefits including increased power due to fewer analytes and greater sensitivity for low-copy genes. Of note, this approach is readily adapted to the study of HIV-1-infected cells, as the assays target regions present in both SIV and HIV-1.

MATERIALS AND METHODS
Animals and SIV infection. Four colony-bred Indian-origin male rhesus macaques of ages 3 to 8 years were infected with SIVmac251 administered either intravenously (50% median infective dose [MID 50 ] of 100; VRC animal protocol 356, animal AY69) or intrarectally (50% tissue culture infective dose [TCID 50 ] of 500; Bioqual animal protocol 15-088, animals BS91, T594, and T595). Virus preparations of 1.0 mL were inoculated via the saphenous vein or a lubricated feeding catheter inserted into the rectum. Infection duration was 10 days (AY69, viral load 24,159,248 RNA copies/mL), 15 weeks (T595, ,50 copies/mL), or 7.5 months (T594 and BS91; 33,600 and 175,538 copies/mL, respectively). T595 SIV viral load peaked at 77,156,000 17 days postinfection and fell to 17,450 at week 8, when daily subcutaneous cART consisting of tenofovir disoproxil fumarate (TDF), emtricitabine (FTC), and dolutegravir (DTG) was initiated. Viral load was suppressed to 300 at week 9 and sustained below the assay limit of detection (,50 copies/mL) weeks 10 through 15.  Mattapallil) were sorted by flow cytometry at limiting dilutions in replicate. Cell sorting was performed immediately after specimen thawing and staining without any in vitro manipulation or culture. For lymph node specimens from the untreated macaques, 1,000 to 2,000 memory CD4 T cells were isolated and deposited into 96-well PCR plates at one cell per well. For the cART-suppressed animal, replicates of 10 to 300 cell dilutions were collected and the dilution containing a single infected cell was identified by Poisson distribution probability estimation based on viral nucleic acid positivity of ,33%. Flow cytometry data analysis was performed using FlowJo software (BD Biosciences), v. 9 and 10. t-distributed stochastic neighbor embedding (tSNE) was used to visualize population representation and clustering results with the recommended parameters (perplexity of 30 and iteration of 1,000) in FlowJo v. 10 (83). Markers used in the gating strategy were not included in generating the tSNE plots.
cDNA synthesis, nuclear lysis, and quantitative PCR. Cells were sorted directly into 10 mL RT-PCR buffer in 96-well PCR plates for immediate lysis and gene-specific multiplexed primer-based reverse transcription for 15 min followed by 3 cycles of preamplification using SuperScript III Platinum one-step qRT-PCR kit (ThermoFisher/Invitrogen). mRNA primers (45 nM) were as follows: SIV gag and tat/rev (48) and TaqMan assays (ThermoFisher/Invitrogen) for CD3E (Rh01062242_m1), CD4 (Rh02621720_m1), FLIP (Hs01116280_m1), and CD40LG (Rh02787995_m1). The nuclear membrane was then lysed by addition of 5 mL of 0.3 mg/mL proteinase K (Roche; diluted in water immediately before use) to the cDNA mixture, incubation for 3 h at 55°C, and proteinase K inactivation for 10 min at 95°C. Five microliters of Platinum Taq reaction (ThermoFisher/Invitrogen) was added to the preamplified cDNA and nuclear genomic DNA mixture for further preamplification by PCR for 15 cycles, using the residual RT primers remaining in the reaction and with the addition of primers for Alu(lambda)-LTR (Alu forward: 59-TCCCAGCTACTCG GGAGGCTGAGG, Alu reverse: 59-CTCCCAAAGTGCTGGGATTACAGG, Lambda/LTR: 59-CCTAGCCACG Latent SIV-Infected CD4 1 T Cell Profiling Microbiology Spectrum TAAGCGAAACTAGGGACTAATTTCCATAGCCAGCCAAAT) at 180 nM and 45 nm for Alu and Lambda/LTR primers, respectively. The resulting preamplified cDNA and genomic DNA were diluted 5-fold in DNA suspension buffer (Teknova) and quantified by qPCR performed for 45 cycles on a QuantStudio 6 (ThermoFisher/Applied Biosytems) using Platinum Taq (ThermoFisher/Invitrogen) and primers and probes specific for SIVmac251 gag, tat/rev (48), RRE (forward: 59-GGTGGCACCTCAAGAAATAA, reverse: GCACTATCCCAGCCAATAAA, probe: 59-/56-FAM/AACCCAAGA/ZEN/ACCCTAGCACAAAGACC/3IABkFQ/), lambda-LTR (lambda: 59-CCTAGCCACGTAAGCGAAACTAG, LTR forward: 59-CTTAGAAAAGGAAGAAGGCA TCATACCA, probe: 59-/56-FAM/CAGGATTAC/ZEN/ACCTCAGGACCAGGAATTAGATACCC/3IABkFQ/), and the TaqMan assays for CD3, CD4, CD40LG, and FLIP described above. The RRE assay was absent during RT and preamplification as amplicons encompassing RRE generated from unspliced viral RNA, and viral DNA by tat/rev primers served as the template for the RRE qPCR. qPCR data were converted to E T values (40 2 CT) to reflect relative transcript levels. Amplification efficiency was calculated as E = 21 1 10 (21/slope) . Absolute RNA copies were calculated as follows: copy number = 2^(E T 2 E T minimal ), where minimal E T is 13 for tat/rev and gag, based on the previously determined average value for a single RNA copy by this method (48,49), and minimal E T is 10 for RRE (RNA or DNA), based on the distribution of observed E T values. For host gene expression, E T values of ,13 were pinned at 12 to reflect less than one copy (49).
Sequencing across integration sites. Integrated viral genomic DNA from FACS-sorted single cells preamplified for 15 cycles was further PCR amplified for an additional 45 to 90 cycles using lambda and Alu primers. The resulting PCR products were separated on 2% agarose gel supplemented with SYBR safe DNA gel stain (Invitrogen), and all visible bands were excised manually. DNA was extracted using MinElute gel extraction kit (Qiagen) and sequenced by Sanger method using the Lambda primer (ACGT, Inc.). Unambiguous sequencing reads were aligned with SIVmac239 sequence using Geneious (Biomatters, Inc.) and with M. mulatta using the BLAST/BLAT tool in Ensembl Genome Browser.
Statistical analysis. Differential protein and gene expression between groups of infected cells was determined using Dunn's test for nonparametric pairwise multiple comparisons following Kruskal-Wallis test and false-discovery rate adjustment using Benjamini-Hochberg (P adj , 0.10). Data availability. The data sets that support the findings of this study are available from the corresponding author upon reasonable request.